Malay Grapheme to Phoneme Tool for Automatic Speech Recognition
Authors
Abstract
This paper presents the design and performance of a Malay grapheme-to-phoneme (G2P) tool that generates the pronunciation dictionary for a Malay automatic speech recognition (ASR) system. The G2P tool is rule-based: rules can be added and removed easily, and it handles English words. It also contains morphological and syllable tools, which it uses to determine the pronunciation of a word. Our evaluation shows that, with the pronunciation dictionary generated automatically by our G2P tool, our Malay ASR system achieves a word error rate (WER) of 16.5%, only 1.9% higher than with a manually verified pronunciation dictionary.
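As a rough illustration of what a rule-based G2P with addable and removable rules can look like, the following Python sketch applies a small table of grapheme-to-phoneme rules using longest-match-first lookup. The rule table, the function names (`g2p`, `add_rule`, `remove_rule`), and the IPA symbols are assumptions made for this example; they are not the rule set or interface of the tool described in the abstract, and the sketch omits the morphological and syllable analysis and the English-word handling that the paper mentions.

```python
# Illustrative rule table: (grapheme sequence, phoneme symbol).
# These entries are assumptions for the sketch, not the paper's rules.
RULES = [
    ("ng", "ŋ"),
    ("ny", "ɲ"),
    ("sy", "ʃ"),
    ("c", "tʃ"),
    ("j", "dʒ"),
    ("y", "j"),
]


def add_rule(rules, grapheme, phoneme):
    """Return a new rule list with the rule added (replacing any old one)."""
    if not grapheme:
        raise ValueError("grapheme must be a non-empty string")
    kept = [(g, p) for g, p in rules if g != grapheme]
    return kept + [(grapheme, phoneme)]


def remove_rule(rules, grapheme):
    """Return a new rule list without the rule for the given grapheme."""
    return [(g, p) for g, p in rules if g != grapheme]


def g2p(word, rules=RULES):
    """Convert a word to a list of phoneme symbols.

    Longer graphemes are tried first, so "ng" wins over "n" + "g".
    Letters with no matching rule are passed through unchanged.
    """
    word = word.lower()
    ordered = sorted(rules, key=lambda rule: len(rule[0]), reverse=True)
    phonemes = []
    i = 0
    while i < len(word):
        for grapheme, phoneme in ordered:
            if word.startswith(grapheme, i):
                phonemes.append(phoneme)
                i += len(grapheme)
                break
        else:
            # No rule matched at this position: keep the letter as-is.
            phonemes.append(word[i])
            i += 1
    return phonemes


if __name__ == "__main__":
    for w in ["nyanyi", "bunga", "cuci"]:
        print(w, " ".join(g2p(w)))
```

Keeping the rules in a plain list makes adding or removing a rule a data change rather than a code change, which is one simple way to obtain the flexibility the abstract describes.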
Related resources
Fast Bootstrapping of Grapheme to Phoneme System for Under-resourced Languages - Application to the Iban Language
This paper deals with the fast bootstrapping of a Grapheme-to-Phoneme (G2P) conversion system, which is a key module for both automatic speech recognition (ASR) and text-to-speech synthesis (TTS). The idea is to exploit language contact between a local dominant language (Malay) and a very under-resourced language (Iban spoken in Sarawak and in several parts of the Borneo Island) for which no res...
Sixth International Joint Conference on Natural Language Processing Proceedings of the Fourth Workshop on South and Southeast Asian Natural Language Processing
This paper deals with the fast bootstrapping of a Grapheme-to-Phoneme (G2P) conversion system, which is a key module for both automatic speech recognition (ASR) and text-to-speech synthesis (TTS). The idea is to exploit language contact between a local dominant language (Malay) and a very under-resourced language (Iban spoken in Sarawak and in several parts of the Borneo Island) for which no res...
Grapheme to phoneme conversion using an SMT system
This paper presents an automatic grapheme to phoneme conversion system that uses statistical machine translation techniques provided by the Moses Toolkit. The generated word pronunciations are employed in the dictionary of an automatic speech recognition system and evaluated using the ESTER 2 French broadcast news corpus. Grapheme to phoneme conversion based on Moses is compared to two other me...
Integrating Thai grapheme based acoustic models into the ML-MIX framework - for language independent and cross-language ASR
Grapheme based speech recognition is a powerful tool for rapidly creating automatic speech recognition (ASR) systems in new languages. For purposes of language independent or cross language speech recognition it is necessary to identify similar models in the different languages involved. For phoneme based multilingual ASR systems this is usually achieved with the help of a language independent ...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...